Feedback-Directed Optimizations with Estimated Edge Profiles from Hardware Event Sampling

Authors

  • Vinodha Ramasamy
  • Robert Hundt
  • Dehao Chen
  • Wenguang Chen
Abstract

Traditional feedback-directed optimization (FDO) uses static instrumentation to collect profiles. This method has shown good application performance gains, but it is not commonly used in practice because of the high runtime overhead of profile collection, the tedious dual-compile usage model, and the difficulty of generating representative training data sets. In this paper, we show that edge frequency estimates can be successfully constructed with heuristics from profile data collected by sampling hardware events. The approach incurs low runtime overhead (e.g., less than 2%) and requires no instrumentation, yet achieves competitive performance gains. Our initial results show a 3-4% performance gain on the SPEC C benchmarks.
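The abstract does not spell out the heuristics, so the following is only an illustrative sketch of the general idea and not the authors' algorithm. It assumes that each sample is attributed to an instruction address, that a basic block's execution count can be approximated by averaging the samples over its instructions and scaling by the sampling period, and that edge counts can then be filled in by simple flow conservation wherever a block has exactly one unresolved incoming or outgoing edge. All function names, parameters, and the example control-flow graph are hypothetical.

```python
from collections import defaultdict


def estimate_block_counts(samples, blocks, period):
    """Approximate basic-block execution counts from sampled addresses.

    samples: dict mapping instruction address -> number of samples seen there
    blocks:  dict mapping block name -> list of instruction addresses in it
    period:  assumed number of events between two consecutive samples
    """
    counts = {}
    for name, addrs in blocks.items():
        total = sum(samples.get(addr, 0) for addr in addrs)
        # Average over the block's instructions, then scale by the period.
        counts[name] = total * period / len(addrs)
    return counts


def estimate_edge_counts(block_counts, edges):
    """Derive edge counts by flow conservation (a simplifying assumption).

    Whenever a block has exactly one unresolved outgoing (or incoming) edge,
    that edge receives the block count minus the already resolved edges.
    Edges that cannot be resolved this way are left out of the result.
    """
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    for src, dst in edges:
        out_edges[src].append((src, dst))
        in_edges[dst].append((src, dst))

    known = {}
    changed = True
    while changed:
        changed = False
        for block, count in block_counts.items():
            for group in (out_edges[block], in_edges[block]):
                unknown = [e for e in group if e not in known]
                if len(unknown) == 1:
                    resolved = sum(known[e] for e in group if e in known)
                    # Sampling noise can make the difference negative; clamp it.
                    known[unknown[0]] = max(count - resolved, 0.0)
                    changed = True
    return known


# Hypothetical diamond-shaped control-flow graph: entry -> {then, else} -> join.
blocks = {"entry": [0, 1], "then": [2, 3], "else": [4], "join": [5, 6]}
samples = {0: 10, 1: 10, 2: 7, 3: 7, 4: 3, 5: 10, 6: 10}
edges = [("entry", "then"), ("entry", "else"), ("then", "join"), ("else", "join")]

block_counts = estimate_block_counts(samples, blocks, period=1000)
edge_counts = estimate_edge_counts(block_counts, edges)
```

In this toy graph every edge can be resolved locally. Real control-flow graphs with noisy samples generally need a more robust method to reconcile inconsistent counts, which this sketch does not attempt.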

Similar resources

Feedback-Directed Optimizations in GCC with Estimated Edge Profiles from Hardware Event Sampling

Traditional feedback-directed optimization (FDO) in GCC uses static instrumentation to collect edge and value profiles. This method has shown good application performance gains, but is not commonly used in practice due to the high runtime overhead of profile collection, the tedious dual-compile usage model, and difficulties in generating representative training data sets. In this paper, we show...

Using Large Input Sets with Hardware Performance Monitoring for Profile Based Compiler Optimizations

Traditional Profile Guided Optimization (PGO) uses program instrumentation with one or more small training input data sets to generate edge or value profiles to guide compiler optimizations. This approach has been effective in predicting branch directions for many applications. However, for optimizations that are more dependent on the performance characteristics and the accuracy of the profiles...

ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors

Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor’s performance monitoring hardware is an effective, unobtrusive way to obtain detailed profiles. Unfortunately, existing hardware simply counts events, such as cache misses and branch mispredictions, and cannot accurately attribute these events to instructions, especially ...

Adaptive Sampling of Performance Counters

Many applications of profiling based on sampling of Performance Counters (PC), such as feedback-directed optimization and software reliability, are often constrained by the amount of information that can be obtained without significantly perturbing the behavior of the profiled task. Current implementations of event- and time-based sampling software utilize fixed or random sampling periods which a...

Implementing the render cache and the edge-and-point image on graphics hardware

The render cache and the edge-and-point image (EPI) are techniques that permit high quality rendering at interactive rates of models illuminated with complex ray traced techniques, combining sparse sampling and discontinuities-respecting interpolation. The image reconstruction is decoupled from the samples generation process and permits the use of arbitrary shaders to gather shading samples. Al...

Publication date: 2008